Abstract

DTG are designed to share some of the ad- 
vantages of TAG while overcoming some of 
its limitations. DTG involve two composi- 
tion operations called subsertion and sister- 
adjunction. The most distinctive feature of 
DTG is that, unlike TAG, there is complete 
uniformity in the way that the two DTG op- 
erations relate lexical items: subsertion al- 
ways corresponds to complementation and 
sister-adjunction to modification. Furthermore,
DTG, unlike TAG, can provide a uniform
analysis for wh-movement in English
and Kashmiri, despite the fact that the wh-element
in Kashmiri appears in sentence-second
position, and not sentence-initial
position as in English.



1 Introduction 

We define a new grammar formalism, called D-Tree
Grammars (DTG), which arises from work on Tree-Adjoining
Grammars (TAG) (Joshi et al., 1975). A
salient feature of TAG is the extended domain of locality
it provides. Each elementary structure can
be associated with a lexical item (as in Lexicalized
TAG (LTAG) (Joshi & Schabes, 1991)). Properties
related to the lexical item (such as subcategorization,
agreement, certain types of word order variation)
can be expressed within the elementary structure
(Kroch, 1987; Frank, 1992). In addition, TAG
remain tractable, yet their generative capacity is sufficient
to account for certain syntactic phenomena
that, it has been argued, lie beyond Context-Free
Grammars (CFG) (Shieber, 1985). TAG, however, has
two limitations which provide the motivation for this
work. The first problem (discussed in Section 1.1)
is that the TAG operations of substitution and adjunction
do not map cleanly onto the relations of
complementation and modification. A second problem
(discussed in Section 1.2) has to do with the
inability of TAG to provide analyses for certain syntactic
phenomena. In developing DTG we have tried
to overcome these problems while remaining faithful
to what we see as the key advantages of TAG (in
particular, its enlarged domain of locality). In Section
1.3 we introduce some of the key features of
DTG and explain how they are intended to address
the problems that we have identified with TAG.

1.1 Derivations and Dependencies 

In LTAG, the operations of substitution and adjunc- 
tion relate two lexical items. It is therefore natural 
to interpret these operations as establishing a di- 
rect linguistic relation between the two lexical items, 
namely a relation of complementation (predicate- 
argument relation) or of modification. In purely
CFG-based approaches, these relations are only implicit.
However, they represent important linguistic
intuition, they provide a uniform interface to semantics,
and they are, as Schabes & Shieber (1994)
argue, important in order to support statistical parameters
in stochastic frameworks and appropriate
adjunction constraints in TAG. In many frameworks,
complementation and modification are in fact made
explicit: LFG (Bresnan & Kaplan, 1982) provides a
separate functional (f-) structure, and dependency
grammars (see e.g. Mel'cuk (1988)) use these notions
as the principal basis for syntactic representation.
We will follow the dependency literature
in referring to complementation and modification
as syntactic dependency. As observed by Rambow
and Joshi (1992), for TAG, the importance of the
dependency structure means that not only the derived
phrase-structure tree is of interest, but also
the operations by which we obtained it from elementary
structures. This information is encoded in
the derivation tree (Vijay-Shanker, 1987).

However, as Vijay-Shanker (1992) observes, the
TAG composition operations are not used uniformly:
while substitution is used only to add a (nominal)
complement, adjunction is used both for modification
and (clausal) complementation. Clausal complementation
could not be handled uniformly by
substitution because of the existence of syntactic
phenomena such as long-distance wh-movement in
English. Furthermore, there is an inconsistency in
the directionality of the operations used for complementation
in TAG: nominal complements are substituted
into their governing verb's tree, while the
governing verb's tree is adjoined into its own clausal
complement. The fact that adjunction and substitution
are used in a linguistically heterogeneous manner
means that (standard) TAG derivation trees do
not provide a good representation of the dependencies
between the words of the sentence, i.e., of the
predicate-argument and modification structure.

Figure 1: Derivation trees for (1): original definition
(left); Schabes & Shieber definition (right)

For instance, English sentence (1) gets the derivation
structure shown on the left in Figure 1.

(1) Small spicy hotdogs he claims Mary seems to adore

When comparing this derivation structure to the
dependency structure in Figure 2, the following
problems become apparent. First, both adjectives
depend on hotdog, while in the derivation structure
small is a daughter of spicy. In addition, seem depends
on claim (as does its nominal argument, he),
and adore depends on seem. In the derivation structure,
seem is a daughter of adore (the direction does
not express the actual dependency), and claim is also
a daughter of adore (though neither is an argument
of the other).
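
The mismatch described above can be made concrete with a small sketch. The parent maps below are an illustrative encoding only, not notation from this paper; the edges that the text does not spell out (the attachment of he in the derivation structure, and of Mary and hotdog in both structures) are assumptions.

```python
# child -> parent maps; the root of each structure has no entry.
# Assumed edges (not stated in the text): Mary and hotdog under adore,
# he under claim in the derivation structure.
dependency = {
    "he": "claim", "seem": "claim", "adore": "seem",
    "Mary": "adore", "hotdog": "adore",
    "small": "hotdog", "spicy": "hotdog",
}
tag_derivation = {
    "claim": "adore", "seem": "adore", "he": "claim",
    "Mary": "adore", "hotdog": "adore",
    "spicy": "hotdog", "small": "spicy",
}

def mismatches(derivation, dependency):
    """Return the words whose parent differs between the two structures."""
    words = set(derivation) | set(dependency)
    return sorted(w for w in words if derivation.get(w) != dependency.get(w))

print(mismatches(tag_derivation, dependency))
```

Under these assumptions the words that come out as mismatched are exactly those named in the discussion: small, seem, claim and adore.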

Figure 2: Dependency tree for (1)



Schabes & Shieber (1994) solve the first problem
by distinguishing between the adjunction of modifiers
and of clausal complements. This gives us the
derivation structure shown on the right in Figure 1.
While this might provide a satisfactory treatment of
modification at the derivation level, there are now
three types of operations (two adjunctions and substitution)
for two types of dependencies (arguments
and modifiers), and the directionality problem for
embedded clauses remains unsolved.

1 For clarity, we depart from standard TAG notational
practice and annotate nodes with lexemes and arcs with
grammatical function.

In defining DTG we have attempted to resolve
these problems with the use of a single operation
(that we call subsertion) for handling all complementation
and a second operation (called sister-adjunction)
for modification. Before discussing
these operations further we consider a second problem
with TAG that has implications for the design
of these new composition operations (in particular,
subsertion).

1.2 Problematic Constructions for TAG 

TAG cannot be used to provide suitable analyses
for certain syntactic phenomena, including long-distance
scrambling in German (Becker et al., 1991),
Romance clitics (Bleam, 1994), wh-extraction out
of complex picture-NPs (Kroch, 1987), and Kashmiri
wh-extraction (presented here). The problem
in describing these phenomena with TAG arises from
the fact (observed by Vijay-Shanker (1992)) that
adjoining is an overly restricted way of combining
structures. We illustrate the problem by considering
Kashmiri wh-extraction, drawing on Bhatt (1994).
Wh-extraction in Kashmiri proceeds as in English,
except that the wh-word ends up in sentence-second
position, with a topic from the matrix clause in
sentence-initial position. This is illustrated in (2a)
for a simple clause and in (2b) for a complex clause.

(2) a. rameshan   kyaa     dyutnay  tse
       RameshERG  whatNOM  gave     youDAT
       What did you give Ramesh?

    b. rameshan   kyaa_i  chu  baasaan       [ki    me    kor  t_i]
       RameshERG  what    is   believeNPerf   that  IERG  do
       What does Ramesh believe that I did?

Since the moved element does not appear in
sentence-initial position, the TAG analysis of English
wh-extraction of Kroch (1987; 1989) (in which the
matrix clause is adjoined into the embedded clause)
cannot be transferred, and in fact no linguistically
plausible TAG analysis appears to be available.

In the past, variants of TAG have been developed
to extend the range of possible analyses. In
Multi-Component TAG (MCTAG) (Joshi, 1987), trees
are grouped into sets which must be adjoined together
(multicomponent adjunction). However, MCTAG
lack expressive power since, while syntactic relations
are invariably subject to c-command or dominance
constraints, there is no way to state that
two trees from a set must be in a dominance relation
in the derived tree. MCTAG with Domination
Links (MCTAG-DL) (Becker et al., 1991) are multicomponent
systems that allow for the expression of
dominance constraints. However, MCTAG-DL share a
further problem with MCTAG: the derivation structures
cannot be given a linguistically meaningful interpretation.
Thus, they fail to address the first
problem we discussed (in Section 1.1).



1.3 The DTG Approach 



Vijay-Shanker (1992) points out that use of adjunction
for clausal complementation in TAG corresponds,
at the level of dependency structure, to substitution
at the foot node of the adjoined tree. However,
adjunction (rather than substitution) is used
since, in general, the structure that is substituted
may only form part of the clausal complement: the
remaining substructure of the clausal complement
appears above the root of the adjoined tree. Unfortunately,
as seen in the examples given in Section 1.2,
there are cases where satisfactory analyses
cannot be obtained with adjunction. In particular,
using adjunction in this way cannot handle cases in
which parts of the clausal complement are required
to be placed within the structure of the adjoined
tree.

The DTG operation of subsertion is designed to
overcome this limitation. Subsertion can be viewed
as a generalization of adjunction in which components
of the clausal complement (the subserted
structure) which are not substituted can be interspersed
within the structure that is the site of the
subsertion. Following earlier work (Becker et al.,
1991; Vijay-Shanker, 1992), DTG provide a mechanism
involving the use of domination links (d-edges)
that ensure that parts of the subserted structure
that are not substituted dominate those parts that
are. Furthermore, there is a need to constrain the
way in which the non-substituted components can
be interspersed. This is done by either using appropriate
feature constraints at nodes or by means
of subsertion-insertion constraints (see Section 2).

We end this section by briefly commenting on the
other DTG operation of sister-adjunction. In TAG,
modification is performed with adjunction of modifier
trees that have a highly constrained form. In
particular, the foot nodes of these trees are always
daughters of the root and either the leftmost or
rightmost frontier nodes. The effect of adjoining a
tree of this form corresponds (almost) exactly to the
addition of a new (leftmost or rightmost) subtree below
the node that was the site of the adjunction. For
this reason, we have equipped DTG with an operation
(sister-adjunction) that does exactly this and
nothing more. From the definition of DTG in Section 2
it can be seen that the essential aspects of
Schabes & Shieber's (1994) treatment of modification,
including multiple modifications of a phrase,
can be captured by using this operation.

2 In these cases the foot node is an argument node of
the lexical anchor.

3 This was also observed by Rambow (1994a), where
an integrity constraint (first defined for an ID/LP version
of TAG (Becker et al., 1991)) is defined for a MCTAG-DL
version called V-TAG. However, this was found to be insufficient
for treating both long-distance scrambling and
long-distance topicalization in German. V-TAG retains
adjoining (to handle topicalization) for this reason.

After defining DTG in Section 2, we discuss, in
Section 3, DTG analyses for the English and Kashmiri
data presented in this section. Section 4 briefly
discusses DTG recognition algorithms.

2 Definition of D-Tree Grammars 

A d-tree is a tree with two types of edges: domi- 
nation edges (d-edges) and immediate domination 
edges (i-edges). D-edges and i-edges express domi- 
nation and immediate domination relations between 
nodes. These relations are never rescinded when d- 
trees are composed. Thus, nodes separated by an 
i-edge will remain in a mother-daughter relationship 
throughout the derivation, whereas nodes separated 
by a d-edge can be equated or have a path of any
length inserted between them during a derivation. 
D-edges and i-edges are not distributed arbitrarily 
in d-trees. For each internal node, either all of its 
daughters are linked by i-edges or it has a single 
daughter that is linked to it by a d-edge. Each node 
is labelled with a terminal symbol, a nonterminal 
symbol or the empty string. A d-tree containing n 
d-edges can be decomposed into n + 1 components 
containing only i-edges. 
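
As an informal illustration of this definition, the following sketch encodes a d-tree, checks the restriction on how d-edges and i-edges may be distributed, and splits the d-tree into its components. The class and function names and the example tree are assumptions made for illustration; they are not part of the formalism.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                    # terminal, nonterminal, or "" (empty string)
    children: list = field(default_factory=list)  # (edge_kind, Node) pairs; kind is "i" or "d"

    def add(self, kind, child):
        self.children.append((kind, child))
        return child

def is_well_formed(node):
    """Every internal node has only i-edge daughters, or exactly one d-edge daughter."""
    kinds = [kind for kind, _ in node.children]
    ok = all(k == "i" for k in kinds) or kinds == ["d"]
    return ok and all(is_well_formed(child) for _, child in node.children)

def components(root):
    """Split a d-tree into components: maximal groups of nodes linked by i-edges only."""
    comps = [[]]
    def walk(node, current):
        current.append(node.label)
        for kind, child in node.children:
            if kind == "i":
                walk(child, current)
            else:                       # a d-edge starts a new component
                comps.append([])
                walk(child, comps[-1])
    walk(root, comps[0])
    return comps

# Hypothetical d-tree with one d-edge: S dominates a VP, which
# immediately dominates V and a lower VP.
s = Node("S")
vp = s.add("d", Node("VP"))
vp.add("i", Node("V"))
vp.add("i", Node("VP"))

print(is_well_formed(s))
print(components(s))
```

With one d-edge the example splits into two components, in line with the n + 1 count stated above.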

D-trees can be composed using two operations:
subsertion and sister-adjunction. When a d-tree
α is subserted into another d-tree β, a component of
α is substituted at a frontier nonterminal node (a
substitution node) of β and all components of α
that are above the substituted component are inserted
into d-edges above the substituted node or
placed above the root node. For example, consider
the d-trees α and β shown in Figure 3. Note that
components are shown as triangles. In the composed
d-tree γ the component α(5) is substituted
at a substitution node in β. The components α(1),
α(2), and α(4) of α above α(5) drift up the path
in β which runs from the substitution node. These
components are then inserted into d-edges in β or
above the root of β. In general, when a component
α(i) of some d-tree α is inserted into a d-edge between
nodes η_1 and η_2, two new d-edges are created,
the first of which relates η_1 and the root node of
α(i), and the second of which relates the frontier
node of α(i) that dominates the substituted component
to η_2. It is possible for components above
the substituted node to drift arbitrarily far up the
d-tree and distribute themselves within domination
edges, or above the root, in any way that is compatible
with the domination relationships present in the
substituted d-tree. DTG provide a mechanism called
subsertion-insertion constraints to control what
can appear within d-edges (see below).

Figure 3: Subsertion

4 Santorini and Mahootian (1995) provide additional
evidence against the standard TAG approach to modification
from code switching data, which can be accounted
for by using sister-adjunction.
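
The insertion step of subsertion (a component placed inside a d-edge, yielding two new d-edges) can be sketched as follows. This is a deliberately reduced illustration: d-edges are plain (upper, lower) pairs, and the node names used in the example are hypothetical.

```python
def insert_component(d_edges, edge, comp_root, comp_frontier):
    """Insert a component into the d-edge `edge` = (eta1, eta2).

    The d-edge is replaced by two new d-edges: one from eta1 to the root of
    the component, and one from the component's frontier node (the one that
    dominates the substituted component) to eta2.
    """
    if edge not in d_edges:
        raise ValueError("not a d-edge of this d-tree")
    eta1, eta2 = edge
    i = d_edges.index(edge)
    return d_edges[:i] + [(eta1, comp_root), (comp_frontier, eta2)] + d_edges[i + 1:]

# Hypothetical d-edge between an S' node and an S node.
edges = [("S'", "S")]
edges = insert_component(edges, ("S'", "S"), "a1.root", "a1.frontier")
print(edges)
```

Each insertion replaces one d-edge by two, so the domination relation between the original end nodes is preserved while further components can still be inserted into either new d-edge.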

The second composition operation involving d-trees
is called sister-adjunction. When a d-tree α is
sister-adjoined at a node η in a d-tree β the composed
d-tree γ results from the addition to β of
α as a new leftmost or rightmost sub-d-tree below
η. Note that sister-adjunction involves the addition
of exactly one new immediate domination edge and
that several sister-adjunctions can occur at the same
node. Sister-adjoining constraints specify where
d-trees can be sister-adjoined and whether they will
be right- or left-sister-adjoined (see below).
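
A minimal sketch of sister-adjunction on ordinary trees (d-edges are left out for brevity) is given below. The dictionary-based tree encoding and the N' structure for hotdogs are assumptions for illustration.

```python
def leaf(label):
    return {"label": label, "children": []}

def sister_adjoin(node, subtree, direction):
    """Add `subtree` as the new leftmost or rightmost daughter of `node` (in place)."""
    if direction == "left":
        node["children"].insert(0, subtree)
    elif direction == "right":
        node["children"].append(subtree)
    else:
        raise ValueError("direction must be 'left' or 'right'")

def frontier(node):
    """Left-to-right list of leaf labels."""
    if not node["children"]:
        return [node["label"]]
    return [w for child in node["children"] for w in frontier(child)]

# Hypothetical N' node for "hotdogs"; both adjectives are left-sister-adjoined
# at the same node, so both end up as daughters of N'.
n_bar = {"label": "N'", "children": [{"label": "N", "children": [leaf("hotdogs")]}]}
for adjective in ("spicy", "small"):
    sister_adjoin(n_bar, {"label": "AdjP", "children": [leaf(adjective)]}, "left")

print(frontier(n_bar))
```

Because each operation adds exactly one new daughter, repeated sister-adjunction at the same node yields a flat structure in which every modifier attaches directly to the modified node.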

A DTG is a four tuple G = (V_N, V_T, S, D) where
V_N and V_T are the usual nonterminal and terminal
alphabets, S ∈ V_N is a distinguished nonterminal
and D is a finite set of elementary d-trees.
A DTG is said to be lexicalized if each d-tree in
the grammar has at least one terminal node. The
elementary d-trees of a grammar G have two additional
annotations: subsertion-insertion constraints
and sister-adjoining constraints. These will be described
below, but first we define simultaneously
DTG derivations and subsertion-adjoining trees (SA-trees),
which are partial derivation structures that
can be interpreted as representing dependency information,
the importance of which was stressed in
the introduction.

Consider a DTG G = (V_N, V_T, S, D). In defining
SA-trees, we assume some naming convention for the
elementary d-trees in D and some consistent ordering
on the components and nodes of elementary d-trees
in D. For each i, we define the set of d-trees
T_i(G) whose derivations are captured by SA-trees of
height i or less. Let T_0(G) be the set D of elementary
d-trees of G. Mark all of the components of each
d-tree in T_0(G) as being substitutable. Only components
marked as substitutable can be substituted
in a subsertion operation. The SA-tree for α ∈ T_0(G)
consists of a single node labelled by the elementary
d-tree name for α.

For i > 0 let T_i(G) be the union of the set T_{i-1}(G)
with the set of all d-trees γ that can be produced as
follows. Let α ∈ D and let γ be the result of subserting
or sister-adjoining the d-trees γ_1, ..., γ_k into
α where γ_1, ..., γ_k are all in T_{i-1}(G), with the subsertions
taking place at different substitution nodes
in α. Only substitutable components
of γ_1, ..., γ_k can be substituted in these subsertions.
Only the new components of γ that came from α are
marked as substitutable in γ. Let τ_1, ..., τ_k be the
SA-trees for γ_1, ..., γ_k, respectively. The SA-tree τ
for γ has root labelled by the name for α and k subtrees
τ_1, ..., τ_k. The edge from the root of τ to the
root of the subtree τ_i is labelled by l_i (1 ≤ i ≤ k)
defined as follows. Suppose that γ_i was subserted
into α and the root of τ_i is labelled by the name of
some α′ ∈ D. Only components of α′ will have been
marked as substitutable in γ_i. Thus, in this subsertion
some component α′(j) will have been substituted
at a node in α with address n. In this case, the
label l_i is the pair (j, n). Alternatively, γ_i will have
been d-sister-adjoined at some node with address n
in α, in which case l_i will be the pair (d, n) where
d ∈ {left, right}.

5 Due to space limitations, in the following definitions
we are forced to be somewhat imprecise when we identify
a node in a derived d-tree with the node in the elementary
d-trees (elementary nodes) from which it was
derived. This is often done in TAG literature, and hopefully
it will be clear what is intended.

6 We will discuss the notion of substitutability further
in the next section. It is used to ensure the SA-tree
is a tree. That is, an elementary structure cannot be
subserted into more than one structure since this would
be counter to our motivations for using subsertion for
complementation.

The tree set T(G) generated by G is defined as
the set of trees γ such that: γ′ ∈ T_i(G) for some i ≥ 0;
γ′ is rooted with the nonterminal S; the frontier of
γ′ is a string in V_T^*; and γ results from the removal of
all d-edges from γ′. A d-edge is removed by merging
the nodes at either end of the edge as long as they are
labelled by the same symbol. The string language
L(G) associated with G is the set of terminal strings
appearing on the frontier of trees in T(G).
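
The final step of this definition, the removal of d-edges by merging equally labelled end nodes, can be sketched as below. The tuple encoding of d-trees, (label, [(edge_kind, subtree), ...]), and the example are assumptions for illustration, and the input is assumed to respect the restriction that a d-edge daughter is an only daughter.

```python
def remove_d_edges(node):
    """Return a copy of the d-tree in which every d-edge has been removed by
    merging its two end nodes; raises ValueError if their labels differ."""
    label, children = node
    # A node whose only daughter hangs from a d-edge is merged with that daughter.
    while len(children) == 1 and children[0][0] == "d":
        child_label, grandchildren = children[0][1]
        if child_label != label:
            raise ValueError(f"cannot merge {label} with {child_label}")
        children = grandchildren
    return (label, [("i", remove_d_edges(child)) for _, child in children])

# Hypothetical derived d-tree with two d-edges whose end nodes match.
derived = ("S", [("d", ("S", [
    ("i", ("NP", [])),
    ("i", ("VP", [("d", ("VP", [("i", ("V", []))]))])),
]))])

print(remove_d_edges(derived))
```

If the labels at the two ends of some d-edge differ, no tree is produced, which corresponds to that derived d-tree contributing nothing to T(G).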

We have given a reasonably precise definition of
SA-trees since they play such an important role in
the motivation for this work. We now describe informally
a structure that can be used to encode a DTG
derivation. A derivation graph for γ ∈ T(G) results
from the addition of insertion edges to a SA-tree τ
for γ. The location in γ of an inserted elementary
component α(i) can be unambiguously determined
by identifying the source of the node (say the node
with address n in the elementary d-tree α′) with
which the root of this occurrence of α(i) is merged
when d-edges are removed. The insertion edge
will relate the two (not necessarily distinct) nodes
corresponding to appropriate occurrences of α and
α′ and will be labelled by the pair (i, n).

Each d-edge in elementary d-trees has an associated
subsertion-insertion constraint (SIC). A SIC is a
finite set of elementary node addresses (ENAs). An
ENA η specifies some elementary d-tree α ∈ D, a
component of α and the address of a node within
that component of α. If an ENA η is in the SIC associated
with a d-edge between η_1 and η_2 in an elementary
d-tree α then η cannot appear properly within
the path that appears from η_1 to η_2 in the derived
tree γ ∈ T(G).
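
The following sketch shows one way a SIC could be checked once the path between the two ends of a d-edge is known. ENAs are encoded here as (tree name, component index, node address) triples; this encoding and the example addresses are hypothetical.

```python
def violates_sic(path, sic):
    """`path` lists the ENAs on the derived-tree path from eta1 down to eta2,
    both ends included. A SIC is violated when a forbidden ENA appears
    properly within the path, i.e. strictly between the two ends."""
    return any(ena in sic for ena in path[1:-1])

# Hypothetical SIC forbidding one node of a "claims" d-tree.
sic = {("claims", 0, "0")}
top, bottom = ("seems", 0, "0"), ("seems", 1, "0")

print(violates_sic([top, ("adore", 0, "0"), bottom], sic))
print(violates_sic([top, ("claims", 0, "0"), bottom], sic))
```

Only the interior of the path is inspected, so the end nodes of the d-edge themselves never trigger a violation.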

Each node of elementary d-trees has an associated
sister-adjunction constraint (SAC). A SAC is a finite
set of pairs, each pair identifying a direction (left or
right) and an elementary d-tree. A SAC gives a complete
specification of what can be sister-adjoined at
a node. If a node η is associated with a SAC containing
a pair (d, α) then the d-tree α can be d-sister-adjoined
at η. By definition of sister-adjunction,
all substitution nodes and all nodes at the top of
d-edges can be assumed to have SACs that are the
empty set. This prevents sister-adjunction at these
nodes.
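
Since a SAC is a complete specification, checking it amounts to a set lookup, as in this small sketch. The node identifiers and tree names are hypothetical, and nodes without an entry are treated as having an empty SAC.

```python
def can_sister_adjoin(sacs, node, direction, tree_name):
    """True if the SAC of `node` contains the pair (direction, tree_name)."""
    return (direction, tree_name) in sacs.get(node, frozenset())

# Hypothetical SACs: adjective trees may be left-sister-adjoined at the N' node
# of the hotdogs d-tree; a substitution node has the empty set as its SAC.
sacs = {
    "hotdogs.N'": {("left", "small"), ("left", "spicy")},
    "seems.VP-subst": set(),
}

print(can_sister_adjoin(sacs, "hotdogs.N'", "left", "small"))
print(can_sister_adjoin(sacs, "hotdogs.N'", "right", "small"))
print(can_sister_adjoin(sacs, "seems.VP-subst", "left", "small"))
```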

In this section we have defined "raw" DTG. In a
more refined version of the formalism we would associate
(a single) finite-valued feature structure with
each node. It is a matter of further research to determine
to what extent SICs and SACs can be stated
globally for a grammar, rather than being attached
to d-edges/nodes. See the next section for a brief
discussion of linguistic principles from which a grammar's
SICs could be derived.

3 Linguistic Examples 

In this section, we show how an account for the data
introduced in Section 1 can be given with DTG.

3.1 Getting Dependencies Right: English 




Figure 4: D-trees for (1)

In Figure 4, we give a DTG that generates sentence
(1). Every d-tree is a projection from a lexical
anchor. The label of the maximal projection is, we
assume, determined by the morphology of the anchor.
For example, if the anchor is a finite verb, it
will project to S, indicating that an overt syntactic
("surface") subject is required for agreement with
it (and perhaps case-assignment). Furthermore, a
finite verb may optionally also project to S' (as in
the d-tree shown for claims), indicating that a wh-moved
or topicalized element is required. The finite
verb seems also projects to S, even though it
does not itself provide a functional subject. In the
case of the to adore tree, the situation is the inverse:
the functional subject requires a finite verb
to agree with, which is signaled by the fact that its
component's root and frontier nodes are labelled S
and VP, respectively, but the verb itself is not finite
and therefore only projects to VP[-fin]. Therefore,
the subject will have to raise out of its clause for
agreement and case assignment. The direct object
of to adore has wh-moved out of the projection of
the verb (we include a trace for the sake of clarity).

7 Trees used in Section 3 make use of such feature
structures.

8 In this context, it might be beneficial to consider
the expression of a feature-based lexicalist theory such
as HPSG in DTG, similar to the compilation of HPSG to
TAG (Kasper et al., 1995).



Figure 5: Derived tree for (1)

We add SICs to ensure that the projections are
respected by components of other d-trees that may
be inserted during a derivation. A SIC is associated
with the d-edge between the VP and S nodes in the seems
d-tree to ensure that no node labelled S' can be inserted
within it, i.e., it cannot be filled by
a wh-moved element. In contrast, since both the
subject and the object of to adore have been moved
out of the projection of the verb, the paths to these
arguments do not carry any SIC at all.

We now discuss a possible derivation. We start
out with the most deeply embedded clause, the
adores clause. Before subserting its nominal arguments,
we sister-adjoin the two adjectival trees to
the tree for hotdogs. This is handled by a SAC associated
with the N' node that allows all trees rooted
in AdjP to be left sister-adjoined. We then subsert
this structure and the subject into the to adore
d-tree. We subsert the resulting structure into the
seems clause by substituting its maximal projection
node, labelled VP[fin: -], at the VP[fin: -] frontier
node of seems, and by inserting the subject into the
d-edge of the seems tree. Now, only the S node of
the seems tree (which is its maximal projection) is
substitutable. Finally, we subsert this derived structure
into the claims d-tree by substituting the S node
of seems at the S complement node of claims, and
by inserting the object of adores (which has not yet
been used in the derivation) in the d-edge of the
claims d-tree above its S node. The derived tree is
shown in Figure 5. The SA-tree for this derivation
corresponds to the dependency tree given previously
in Figure 2.

9 We enforce island effects for wh-movement by using
a [±extract] feature on substitution nodes. This corresponds
roughly to the analysis in TAG, where islandhood
is (to a large extent) enforced by designating a particular
node as the foot node (Kroch & Joshi, 1986).

Note that this is the only possible derivation in- 
volving these three d-trees, modulo order of opera- 
tions. To see this, consider the following putative 
alternate derivation. We first subsert the to adore 
d-tree into the seems tree as above, by substituting 
the anchor component at the substitution node of 
seems. We insert the subject component of to adore 
above the anchor component of seems. We then sub- 
sert this derived structure into the claims tree by 
substituting the root of the subject component of to 
adore at the S node of claims and by inserting the S 
node of the seems d-tree as well as the object compo- 
nent of the to adore d-tree in the S'/S d-edge of the 
claims d-tree. This last operation is shown in Figure 6.
The resulting phrase structure tree would be
the same as in the previously discussed derivation, 
but the derivation structure is linguistically mean- 
ingless, since to adore would have been subserted 
into both seems and claims. However, this deriva- 
tion is ruled out by the restriction that only substi- 
tutable components can be substituted: the subject 
component of the adore d-tree is not substitutable 
after subsertion into the seems d-tree, and therefore 
it cannot be substituted into the claims d-tree. 

Figure 6: An ill-formed derivation



In the above discussion, substitutability played a
central role in ruling out the derivation. We observe
in passing that the SIC associated with the d-edge in
the seems d-tree also rules out this derivation. The
derivation requires that the S node of seems be inserted
into the S'/S d-edge of claims. However, we
would have to stretch the edge over two components
which are both ruled out by the SIC, since they violate
the projection from seems to its S node. Thus,
the derivation is excluded by the independently motivated
SICs, which enforce the notion of projection.
This raises the possibility that, in grammars that express
certain linguistic principles, substitutability is
not needed for ruling out derivations of this nature.
We intend to examine this issue in future work.

3.2 Getting Word Order Right: Kashmiri 



Figure 7: D-trees for (2b)

Figure 7 shows the matrix and embedded clauses
for sentence (2b). We use the node label VP
throughout and use features such as top (for topic) to
differentiate different levels of projection. Observe
that in both trees an argument has been fronted.

Again, we will use the SICs to enforce the projection
from a lexical anchor to its maximal projection.
Since the direct object of kor has wh-moved out of
its clause, the d-edge connecting it to the maximal
projection of its verb has no SIC. The d-edge connecting
the maximal projection of baasaan to the
Aux component, however, has a SIC that allows only
VP[wh: +, top: -] nodes to be inserted.

Figure 8: Derived d-tree for (2b)

The derivation proceeds as follows. We first subsert
the embedded clause tree into the matrix clause
tree. After that, we subsert the nominal arguments
and function words. The derived structure is shown
in Figure 8. The associated SA-tree is the desired,
semantically motivated, dependency structure: the
embedded clause depends on the matrix clause.

In this section, we have discussed examples where 
the elementary objects have been obtained by projecting
from lexical items. In these cases, we overcome
both the problems with TAG considered in
Section 1. The SICs considered here enforce the
same notion of projection that was used in obtain- 
ing the elementary structures. This method of arriv- 
ing at SICs not only generalizes for the English and 
Kashmiri examples but also appears to apply to the 
case of long-distance scrambling and topicalization 
in German. 



4 Recognition 

It is straightforward to adapt the polynomial-time
CKY-style recognition algorithm for a lexicalized
UVG-DL of Rambow (1994b) to DTG. The entries
in the array recording derivations of substrings of
the input contain a set of elementary nodes along with a
multi-set of components that must be inserted above
during bottom-up recognition. These components
are added or removed at substitution and insertion.
The algorithm simulates traversal of a derived tree;
checking for SICs and SACs can be done easily. Because
of lexicalization, the size of these multi-sets is
polynomially bounded, from which the polynomial
time and space complexity of the algorithm follows.

For practical purposes, especially for lexicalized
grammars, it is preferable to incorporate some element
of prediction. We are developing a polynomial-time
Earley style parsing algorithm. The parser returns
a parse forest encoding all parses for an input
string. The performance of this parser is sensitive to
the grammar and input. Indeed it appears that for
grammars that lexicalize CFG and for English grammar
(where the structures are similar to the LTAG
developed at the University of Pennsylvania (XTAG Research
Group)), we obtain cubic-time complexity.

5 Conclusion 

DTG, like other formalisms in the TAG family, is lex- 
icalizable, but in addition, its derivations are them- 
selves linguistically meaningful. In future work we 
intend to examine additional linguistic data, refin- 
ing aspects of our definition as needed. We will also 
study the formal properties of DTG, and complete 
the design of the Earley style parser. 